Processing Queries and Merging Schemas in Support of Data Integration

Authors

  • Rachel Amanda Pottinger
  • Philip A. Bernstein
Abstract

Co-Chairs of the Supervisory Committee: Affiliate Professor Philip A. Bernstein and Associate Professor Alon Y. Halevy, both of the Department of Computer Science and Engineering.

The goal of data integration is to provide a uniform interface, called a mediated schema, to a set of autonomous data sources, which allows users to query a set of databases without knowing the schemas of the underlying data sources. This thesis describes two aspects of data integration: an algorithm for answering queries posed to a mediated schema and the process of creating a mediated schema. First, we present the MiniCon algorithm for answering queries in a data integration system and explain why MiniCon outperforms previous algorithms by up to several orders of magnitude. Second, given two relational schemas for data sources, we propose an approach for using conjunctive queries to describe mappings between them. We analyze their formal semantics, show how to derive a mediated schema based on such mappings, and show how to translate user queries over the mediated schema into queries over local schemas. We then show a generic Merge operator that merges schemas and mappings regardless of data model or application. Finally, we show how to implement the derivation of mediated schemas using the generic Merge operator.
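To make the idea of describing a mapping with a conjunctive query concrete, here is a minimal sketch in Python. It is not the MiniCon algorithm and not the thesis's Merge operator; it only evaluates one conjunctive query over in-memory relations. The relation names (`staff`, `office`, `works_in`), the `?`-prefix convention for variables, and the function `evaluate` are all hypothetical and chosen for illustration.

```python
def is_var(term):
    # Assumption for this sketch: variables are strings that start with "?".
    return isinstance(term, str) and term.startswith("?")


def evaluate(head, body, db):
    """Evaluate a conjunctive query over a dict of relations.

    head: tuple of variables to return (each must occur in the body).
    body: list of (relation_name, terms) atoms, joined conjunctively.
    db:   dict mapping relation_name -> set of tuples.
    """
    bindings = [{}]  # start with one empty variable assignment
    for rel, terms in body:
        extended = []
        for binding in bindings:
            for row in db[rel]:
                if len(row) != len(terms):
                    continue
                candidate = dict(binding)
                consistent = True
                for term, value in zip(terms, row):
                    if is_var(term):
                        # Bind the variable, or check an existing binding.
                        if candidate.setdefault(term, value) != value:
                            consistent = False
                            break
                    elif term != value:
                        # Constants must match the row value exactly.
                        consistent = False
                        break
                if consistent:
                    extended.append(candidate)
        bindings = extended
    return {tuple(b[v] for v in head) for b in bindings}


# Two hypothetical source relations with different schemas.
db = {
    "staff": {("ann", "cs"), ("bob", "math")},
    "office": {("cs", "allen"), ("math", "padelford")},
}

# Hypothetical mediated relation defined by a conjunctive query:
#   works_in(?n, ?b) :- staff(?n, ?d), office(?d, ?b)
works_in = evaluate(
    ("?n", "?b"),
    [("staff", ("?n", "?d")), ("office", ("?d", "?b"))],
    db,
)
```

Here the shared variable `?d` expresses the join between the two sources, which is the kind of relationship a conjunctive-query mapping can state. A real data integration system would rewrite queries rather than materialize them this way.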


Similar resources

Target setting in the process of merging and restructuring of decision-making units using multiple objective linear programming

This paper presents a novel approach to achieving the goals of data envelopment analysis in the process of reconstruction and integration of decision-making units by using multiple objective linear programming. In this regard, first, we review inverse data envelopment analysis models for data reconstruction and integration. We present a model with multi-objective linear programming structure in...


Schema Integration and Query Processing for Multiple

In a multiple database system, a global schema created by integrating schemas of the component databases provides a uniform interface and high-level location transparency for the users to retrieve data. The main problem for constructing a global schema is to resolve conflicts among component schemas. In this paper, we define corresponding assertions for the database administrators to specify the ...


Specifying Schema Mappings for Query Reformulation in Data Integration Systems

In data integration systems there is a problem of answering queries through a target schema, given a set of mappings between source schemas and the target schema, and given that the data is at the sources. This is of special importance when integrated sources, e.g. from Web data repositories, have overlapping data and its merging is necessary. We propose a language for specifying a class of map...


Towards A Unified Framework For Schema Merging

Merging schemas to create a mediated view is a recurring problem in applications related to data interoperability. The task becomes particularly challenging when the schemas are highly heterogeneous and autonomous. Classical data integration systems rely on a mediated schema created by human experts through an intensive design process. Automatic generation of mediated schemas is still a goal to...


P2P Query Reformulation over Both-As-View Data Transformation Rules

The both-as-view (BAV) approach to data integration has the advantage of specifying mappings between schemas in a bidirectional manner, so that once a BAV mapping has been established between two schemas, queries may be exchanged in either direction between the schemas. By defining public schemas shared between peers, this allows peers to exchange queries via a public schema without the require...


Publication date: 2004